Mechanistic basis for maintenance of CHG DNA methylation in plants

DNA methylation is an evolutionarily conserved epigenetic mechanism essential for transposon silencing and heterochromatin assembly. In plants, DNA methylation widely occurs in the CG, CHG, and CHH (H = A, C, or T) contexts, with the maintenance of CHG methylation mediated by CMT3 chromomethylase. However, how CMT3 interacts with the chromatin environment for faithful maintenance of CHG methylation is unclear. Here we report structure-function characterization of the H3K9me2-directed maintenance of CHG methylation by CMT3 and its Zea mays ortholog ZMET2. Base-specific interactions and DNA deformation coordinately underpin the substrate specificity of CMT3 and ZMET2, while a bivalent readout of H3K9me2 and H3K18 allosterically stimulates substrate binding. Disruption of the interaction with DNA or H3K9me2/H3K18 led to loss of CMT3/ZMET2 activity in vitro and impairment of genome-wide CHG methylation in vivo. Together, our study uncovers how the intricate interplay of CMT3, repressive histone marks, and DNA sequence mediates heterochromatic CHG methylation.
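The three sequence contexts named above can be made concrete with a short sketch. It is illustrative only and not part of the study's analysis pipeline; the function name `cytosine_context` and the top-strand-only treatment are assumptions made for the example.

```python
# Illustrative sketch: classify a cytosine by methylation context.
# H = A, C, or T, as defined in the abstract. Only the given strand is examined.
H_BASES = {"A", "C", "T"}

def cytosine_context(seq: str, i: int) -> str:
    """Return 'CG', 'CHG', 'CHH', or 'NA' for the cytosine at index i of seq."""
    seq = seq.upper()
    if seq[i] != "C":
        raise ValueError("position is not a cytosine")
    downstream = seq[i + 1 : i + 3]  # up to two bases 3' of the cytosine
    if len(downstream) >= 1 and downstream[0] == "G":
        return "CG"
    if len(downstream) == 2 and downstream[0] in H_BASES:
        if downstream[1] == "G":
            return "CHG"
        if downstream[1] in H_BASES:
            return "CHH"
    return "NA"  # too close to the sequence end, or an ambiguous base such as N
```

For example, the cytosine in a CAG trinucleotide is classified as CHG, the context maintained by CMT3 and ZMET2.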


Statistics
For all statistical analyses, confirm that the following items are present in the figure legend, table legend, main text, or Methods section.

- The exact sample size (n) for each experimental group/condition, given as a discrete number and unit of measurement
- A statement on whether measurements were taken from distinct samples or whether the same sample was measured repeatedly
- The statistical test(s) used AND whether they are one- or two-sided. Only common tests should be described solely by name; describe more complex techniques in the Methods section.
- A description of all covariates tested
- A description of any assumptions or corrections, such as tests of normality and adjustment for multiple comparisons
- A full description of the statistical parameters including central tendency (e.g. means) or other basic estimates (e.g. regression coefficient) AND variation (e.g. standard deviation) or associated estimates of uncertainty (e.g. confidence intervals)
- For null hypothesis testing, the test statistic (e.g. F, t, r) with confidence intervals, effect sizes, degrees of freedom and P value noted. Give P values as exact values whenever suitable.
- For Bayesian analysis, information on the choice of priors and Markov chain Monte Carlo settings
- For hierarchical and complex designs, identification of the appropriate level for tests and full reporting of outcomes
- Estimates of effect sizes (e.g. Cohen's d, Pearson's r), indicating how they were calculated

Our web collection on statistics for biologists contains articles on many of the points above.

Software and code
Policy information about availability of computer code

Data collection
X-ray diffraction data were collected using the standard data collection software from synchrotron beamline 24-ID-C, NE-CAT, at the Advanced Photon Source (APS). Illumina sequencing data were collected using the standard Illumina pipeline for the HiSeq4000 (BS-seq).
For manuscripts utilizing custom algorithms or software that are central to the research but not yet described in published literature, software must be made available to editors/reviewers. We strongly encourage code deposition in a community repository (e.g. GitHub). See the Nature Research guidelines for submitting code & software for further information.

Data
Policy information about availability of data
All manuscripts must include a data availability statement. This statement should provide the following information, where applicable:
- Accession codes, unique identifiers, or web links for publicly available datasets
- A list of figures that have associated raw data
- A description of any restrictions on data availability

Coordinates and structure factors for the ZMET2-hmCAG-H3Kc9me2 complex have been deposited in the Protein Data Bank under accession number 7UBU. The bisulfite-sequencing data has been deposited in NCBI Gene Expression Omnibus under accession number GSE180635.

Field-specific reporting
Please select the one below that is the best fit for your research. If you are not sure, read the appropriate sections before making your selection.

Life sciences Behavioural & social sciences Ecological, evolutionary & environmental sciences
For a reference copy of the document with all sections, see nature.com/documents/nr-reporting-summary-flat.pdf

Life sciences study design
All studies must disclose on these points even when the disclosure is negative.

Sample size
Biochemical and enzymatic assays were performed using wild-type or mutant ZMET2 and CMT3 fragments. Bisulfite-sequencing assays were performed using wild-type or mutant CMT3 plasmids. The sample size is sufficient to delineate the mutational effects of ZMET2 and CMT3.
For plant experiments, sample size was chosen based on the standards commonly accepted in the field. For DNA/RNA extraction, 20-30 plants were usually taken for a single replicate. For western blots to quantify protein levels, at least 15 seedlings were taken per replicate.

Data exclusions
No data were excluded.

Replication
For in vitro DNA methylation assays, two or three independent measurements were performed for each sample, as stated in the figure legends and/or Methods section. For transgenic plant lines, we generated at least three homozygous lines and usually present two replicates in the figures. For the T1 data of transgenic plants, three independent plants were used per transgene, as indicated in the figures. For McrBC-qPCR and RT-qPCR, two biological replicates with three technical replicates each were performed, with reproducible results. For BS-seq, two lines of H466A-FLAG transgenic plants and one line each of Col-0, cmt3-11, and R745A-FLAG were used. Data are presented as the mean ± SD or mean ± SEM, as indicated. Statistical analysis was performed with Student's t test for comparing two sets of data with an assumed normal distribution. A p value of less than 0.05 was considered significant. All attempts at replication were successful.
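As a rough illustration of the summary statistics named above (mean, SD, SEM, and Student's t test), the sketch below uses only the Python standard library. It is not the authors' analysis code: the function names are hypothetical, and the pooled-variance form of the t test is an assumption, since the text does not state which variant was used or whether it was one- or two-sided.

```python
import math
import statistics

def summarize(values):
    """Return (mean, sample SD, SEM) for a list of replicate measurements."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)   # sample SD, n - 1 in the denominator
    sem = sd / math.sqrt(n)         # standard error of the mean
    return mean, sd, sem

def student_t(a, b):
    """Return (t statistic, degrees of freedom) for a pooled-variance
    two-sample Student's t test (assumes equal variances, normal data)."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2
```

The p value then follows from the t distribution with the returned degrees of freedom, for instance via a statistics package such as SciPy (`scipy.stats.ttest_ind`), and would be compared against the 0.05 threshold stated above.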

Randomization
The assays performed in this study require a rational approach for activity comparison. Therefore, randomization is not applicable to our experimental setup.

Blinding
Blinding is not applicable to any biochemical or cellular assay performed in this study.
